Plant ubiquitin promoter system.

A DNA segment from the upstream untranscribed region
of a maize ubiquitin gene is disclosed. This ubiquitin promoter
region, which comprises heat shock consensus elements, 
initiates and regulates the transcription of genes placed under 
its control. Recombinant DNA molecules are also described in 
which a ubiquitin promoter is combined with a plant expressible 
structural gene for regulated expression of the structural gene 
and for regulated control of expression when stressed with 
elevated temperatures. Such recombinant DNA molecules are 
introduced into plant tissue so that the promoter/structural
gene combination is expressed. 
Description



PLANT UBIQUITIN PROMOTER SYSTEM 



FIELD OF THE INVENTION 

The invention is in the area of plant molecular
biology and concerns plant genetic engineering by
recombinant DNA technology. The identification and
characterization of a segment of DNA from the
upstream nontranscribed region of a plant ubiquitin
gene are described. This segment is capable of
initiating and driving the transcription of nearby plant
expressible genes in recombinant DNA-containing
tissue from both monocotyledonous and dicotyledonous
plants. The described DNA segment will
enable the selective expression and regulation of
desired structural genes in plant tissue.

BACKGROUND OF THE INVENTION 

Ubiquitin is an 8.5 kDa protein found in eukaryotic
cells in either the free, monomeric state or covalently
joined to various cytoplasmic, nuclear or membrane
proteins. This protein contains 76 amino acid
residues and its amino acid sequence is conserved
to an unusually high extent. The sequence of
ubiquitin is identical between species as diverse as
human, cow, Mediterranean fruit fly, Xenopus and
chicken (U. Bond and M. Schlesinger (1985) Mol.
Cell. Biol. 5:949-956). Yeast and human ubiquitin
differ by only three amino acids (E.
Ozkaynak et al. (1984) Nature 312:663-666), while
plant ubiquitin differs from that of yeast by two amino
acids. Based on this two or three amino acid
difference in sequence, there appear to be at least three
types of ubiquitin: animal, plant and yeast.

Ubiquitin is found in three major cellular compartments:
the cytoplasmic membrane, the cytoplasm
and the nucleus. This protein is required for
ATP-dependent degradation of intracellular proteins,
a non-lysosomal pathway to eliminate from the cell
those proteins that are damaged or abnormal as well
as normal proteins having a short half-life (A.
Hershko et al. (1984) Proc. Natl. Acad. Sci. USA
81:1619-1623; D. Finley et al. (1985) Trends Biochem. Sci.
10:343-347). Ubiquitin binds to a target protein,
tagging it for degradation. The covalent attachment
is through isopeptide linkages between the carboxyl
terminus (glycine) in ubiquitin and the ε-amino
group of lysyl side chains in the target proteins.

Ubiquitin also plays a role in the cellular response
to stresses, such as heat shock and increase in
metal (arsenite) concentration (D. Finley et al. (1985)
supra). Most living cells respond to stress (for
example, exposure to temperatures a few degrees
above normal physiological temperatures, or to
elevated concentrations of heavy metals, ethanol,
oxidants and amino acid analogs) by activating a
small set of genes to selectively synthesize stress
proteins, also called heat shock proteins. In most
organisms these stress proteins were found to have
subunit molecular weights of 89, 70 and 23 kDa (U.
Bond and M. Schlesinger (1985) supra). Ubiquitin,
with a molecular weight of approximately 8.5 kDa,
also responds to stress, since in different species
(yeast, mouse, gerbil and chicken embryo fibroblasts)
the levels of ubiquitin mRNA and ubiquitin
protein increase as a result of different stress
conditions.

In eukaryotic systems the expression of genes is
directed by a region of the DNA sequence called the
promoter. In general, the promoter is considered to
be that portion of the DNA, upstream from the
coding region, that contains the binding site for RNA
polymerase II and initiates transcription of the DNA.
The promoter region also comprises other elements
that act as regulators of gene expression. These
include a TATA box consensus sequence in the
vicinity of about -30, and often a CAAT box
consensus sequence at about -75 bp 5' relative to
the transcription start site, or cap site, which is
defined as +1 (R. Breathnach and P. Chambon
(1981) Ann. Rev. Biochem. 50:349-383; J. Messing et
al. (1983) in Genetic Engineering of Plants, eds. T.
Kosuge, C.P. Meredith and A. Hollaender,
pp. 211-227). In plants the CAAT box may be
substituted by the AGGA box (J. Messing et al.
(1983) supra). Other regulatory elements that may
be present are those that affect gene expression in
response to environmental stimuli, such as illumination
or nutrient availability, or to adverse conditions,
such as heat shock, anaerobiosis or the presence of
heavy metal. In addition, there may be present DNA
sequences which control gene expression during
development, or in a tissue-specific fashion. Other
regulatory elements that have been found are the
enhancers (in animal systems) or the upstream
activating sequences (in yeast), that act to elevate
the overall expression of nearby genes in a manner
that is independent of position and orientation with
respect to the nearby gene. Sequences homologous
to the animal enhancer core consensus sequence,
5'-GGTGTGGAAA(or TTT)G-3', have been described
in plants, for example, in the pea legumin gene at
about position -180 relative to the transcription start
site (G. Lycett et al. (1984) Nucleic Acids Res.
12:4493-4506) and in the maize Adh1 and Adh2
genes at about -200 and -170 bp, respectively, from
the transcription start site. In general, promoters are
found 5', or upstream, relative to the start of the
coding region of the corresponding gene and this
promoter region, comprising all the ancillary regulatory
elements, may contain between 100 and 1000 or
more nucleotides.
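
The positional relationships described above (a TATA box near -30, a CAAT box near -75, an enhancer-like core further upstream) can be illustrated with a short motif scan. The following is only a minimal sketch: the regular expressions are simplified stand-ins for the consensus sequences quoted in the text, the function names are hypothetical, and the demonstration sequence is synthetic rather than taken from any gene discussed here.

```python
import re

# Simplified patterns for the consensus elements named in the text.
# These are illustrative assumptions, not the criteria used by the authors.
MOTIFS = {
    "TATA box": re.compile(r"TATA[AT][AT]"),
    "CAAT box": re.compile(r"CCAAT"),
    "enhancer core": re.compile(r"GGTGTGG(?:AAA|TTT)G"),
}


def scan_upstream(seq, tss_index):
    """Return (motif name, bp upstream of the start site, matched text).

    tss_index is the 0-based index of the transcription start site; only
    the sequence 5' of that index is searched.
    """
    upstream = seq[:tss_index].upper()
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(upstream):
            hits.append((name, tss_index - match.start(), match.group()))
    return sorted(hits, key=lambda hit: hit[1])


def build_demo_sequence():
    """Build a synthetic 300-base sequence with motifs planted at set distances."""
    bases = ["G"] * 300
    tss_index = 250
    for distance, motif in ((200, "GGTGTGGAAAG"), (75, "CCAAT"), (30, "TATAAT")):
        start = tss_index - distance
        bases[start:start + len(motif)] = motif
    return "".join(bases), tss_index


if __name__ == "__main__":
    demo_seq, demo_tss = build_demo_sequence()
    for name, distance, text in scan_upstream(demo_seq, demo_tss):
        print(f"{name}: {text} begins {distance} bp upstream of the start site")
```

A scan of this kind only reports where a pattern begins; whether a hit is functional would still have to be established experimentally, for example by deletion analysis.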

Of the regulatory elements controlling gene
expression, the heat shock element is perhaps one
of the most widely studied. Although the universality
of cellular response to heat shock has been known
for almost a decade, very little is known yet about the
function of the heat shock proteins selectively
synthesized by the stressed cell. The induction of
stress protein synthesis occurs at the transcriptional
level and the response has been found to be similar
in bacteria, fungi, insects and mammals (E. Craig
(1985) CRC Crit. Rev. Biochem. 18:239-280). In
addition to the synthesis and accumulation of the
classic heat shock proteins in response to stress,
cells that are stressed also synthesize proteases
and ubiquitin. In E. coli, a 94 kDa enzyme that has an
ATP-dependent proteolytic activity is encoded by
the lon (capR) gene whose expression is under
control of the heat shock regulon (E. Ozkaynak et al.
(1984) Nature 312:663-666). In chicken embryo
fibroblasts (U. Bond et al. (1985) Mol. Cell. Biol.
5:949-956) the ubiquitin mRNA level increased five
fold after heat shock or after exposure to 50 µM
arsenite. Each mRNA encodes tandemly repeated
identical polypeptides which, upon translation as a
polyubiquitin molecule, give rise to multiple ubiquitin
molecules, offering a distinctive mechanism
for amplifying genetic information. This elevated level
of ubiquitin mRNA does not persist during the
recovery phase after heat shock, indicating a
transient role for free ubiquitin during the stress
response.

It has been postulated (J. Ananthan et al. (1986)
Science 232:522-524) that metabolic stresses that
trigger the activation of heat shock protein genes act
through a common mechanism. The metabolic
stresses that activate heat shock genes cause
denaturation of intracellular proteins; the accumula- 
tion of abnormal proteins acts as a signal to activate 
heat shock genes. A role for ubiquitin in targeting 
abnormal proteins for degradation, as well as for 
different proteolytic enzymes, would be compatible 
with such a model of heat shock protein gene 
regulation. 

Most of the early work on heat shock genes was
done with Drosophila species. In particular, the
Drosophila hsp70 gene was used widely in recombinant
studies. In homologous systems, the Drosophila
hsp70 gene was fused to the E. coli β-galactosidase
structural gene to allow the activity of the
hybrid gene to be distinguished from the five
resident hsp70 heat shock genes in the recipient
Drosophila. Drosophila heat shock genes were also
introduced into heterologous systems, e.g., in
monkey COS cells and mouse cells (H. Pelham
(1982) Cell 30:517-528). Regulation by heat shock
was observed in a hybrid hsp70-lacZ gene, in which
a 7 kb E. coli β-galactosidase DNA fragment
had been inserted into the middle of the hsp70 structural
gene and which was integrated into the Drosophila
germ line. The resultant β-galactosidase activity in the
transformants was shown (J. Lis et al. (1983) Cell
35:403-410) to be regulated by heat shock.

The DNA sequence conferring heat shock response
was identified by deletion analysis of the
Drosophila hsp70 heat shock promoter to be
5'-CTGGAAT__TTCTAGA-3' (H. Pelham et al. (1982)
in Heat Shock From Bacteria to Man, Cold Spring
Harbor Laboratory, pp. 43-48) and is generally
located in the -66 through -47 region of the gene or
approximately 26 bases upstream of the TATA box. It
was further demonstrated that a chemically synthesized
copy of this element, when placed upstream
of the TATA box of the herpes virus
thymidine kinase gene in place of the normal
upstream promoter element, was sufficient to confer
heat inducibility upon the thymidine kinase gene in
monkey COS cells and in Xenopus oocytes. (The
thymidine kinase gene is normally not heat inducible.)
These heat shock sequences interact with heat
shock specific transcription factor(s) which allow
the induction of heat shock proteins (C. Parker et al.
(1984) Cell 37:273-283). Inducers of heat shock
genes could be factors that alter (decrease) the
concentration of heat shock proteins within the cell
and, thus, control the transcription and translation of
heat shock genes.

In higher plants, the stress response was demonstrated
by increased protein synthesis in response
to heat shock in soybean, pea, millet, corn, sunflower,
cotton and wheat (T. Barnett et al. (1980)
Dev. Genet. 1:331-340; J. Key et al. (1981) Proc. Natl.
Acad. Sci. USA 78:3526-3530). The major differences
in heat shock response seen among plant
species are: (a) the amount of total protein synthesized
in response to stress, (b) the size distribution
of the different proteins synthesized, (c) the
optimum temperature of induction of heat shock
proteins and (d) the lethal (breakpoint) temperature.
High molecular weight proteins are found to be
electrophoretically similar among different species.
The low molecular weight heat shock
proteins show more electrophoretic heterogeneity
between species. In plants, the higher molecular
weight proteins resemble those produced in Drosophila.
There is a marked difference, however, in the
complexity of the low molecular weight heat shock
proteins between plants and Drosophila. Four heat
shock proteins, 22, 34, 36 and 27 kDa, are
synthesized in Drosophila, whereas soybean produces
over 20 heat shock proteins having molecular
weights in the range of 15-18 kDa. The low molecular
weight protein genes in soybeans are the most
actively expressed and coordinately regulated genes
under heat shock conditions (F. Schoffl et al. (1982)
J. Mol. Appl. Genet. 1:301-314).

Key et al. (EPO Application No. 85302593.0, filed
April 12, 1985) have studied the promoter region of
plant heat shock genes. Four soybean heat shock
genes (three genes coding for 15-18 kDa heat shock
proteins and one gene coding for a 17.3 kDa heat
shock protein) were cloned and sequenced. The
coding sequences and flanking sequences of the
four heat shock genes were determined. The
promoter regions of these four genes were subcloned,
linked to a T-DNA shuttle vector and
transferred into Agrobacterium tumefaciens. One of
the recombinant clones of a soybean heat shock
gene coding for a 15-18 kDa protein contained an
open reading frame of 462 nucleotides and a 291
nucleotide promoter region upstream of the ATG
translation initiation codon. The promoter included
the TATA box, the CAAT box, the transcription
initiation site and a heat shock consensus sequence
131-144 nucleotides upstream of the ATG translation
start codon with the sequence
5'-CT GAA TTC AG-3'. Only three of the four
clones showed substantial homology in the promoter
region, but there were strong similarities
between the heat shock consensus sequences of all
four clones. Significantly, the coding sequence, the
upstream promoter region and the downstream
flanking region of the four soybean heat shock
genes had almost no resemblance to the corresponding
regions of Drosophila heat shock genes.
Although there were similarities between the consensus
sequence of the promoter region from
Drosophila and soybean heat shock genes, the
promoter regions of soybean heat shock genes did
not possess the inverted repeat sequences characteristic
of Drosophila genes.

The promoter region from the soybean heat shock
genes was used to activate a soybean gene and a
foreign gene (one normally not found in soybean)
and to show regulation of the response by stress
(Key et al., EPO Application No. 85302593.0, filed
April 12, 1985). The promoter was isolated from the
soybean SB 13 heat shock gene as a DNA fragment
extending 65 bp downstream from the start of
transcription to include a major portion of the
untranslated leader sequence but not the start
codon for translation. A β-galactosidase gene was
placed under the control of the heat shock promoter
within the T-DNA of the Ti-plasmid in a stable form
within A. tumefaciens, and then was transferred to a
plant or plant cell culture. The actuality of DNA
transfer was recognized by the expression of the
β-galactosidase gene as the production of a blue
color after heat treatment in a medium containing
the 5-bromo-4-chloro-3-indolyl-β-D-galactoside
substrate molecule (M. Rose et al. (1981) Proc. Natl.
Acad. Sci. USA 78:2460-2464).

Experimentation with cross expression wherein a 
gene from one plant species is examined for 
expression in a different species adds a further 
dimension to the understanding of specific function. 
These experiments may embody the insertion of a 
gene under the control of its own promoter or of a 
gene artificially fused to a different or unnatural
promoter. In 1983 Murai et al. (Science 222:476-482)
obtained expression of the phaseolin gene from
Phaseolus vulgaris L. in sunflower (Helianthus)
tissue under two sets of conditions: (i) when the
phaseolin gene was under the control of its own
promoter and (ii) when the gene was spliced to, and
under the control of, a T-DNA promoter. In subsequent
experiments it was shown that the phaseolin
structural gene under the control of its natural
promoter could be expressed in tobacco and that
the tissue-specific expression in the heterologous
host (tobacco) was similar to that in the native host
(bean) (C. Sengupta-Gopalan et al. (1985) Proc.
Natl. Acad. Sci. USA 82:3320-3324).

In later experiments (J. Jones et al. (1985) EMBO 
J. 4:2411-2418) the expression of the octopine 
synthetase gene (ocs) was described in both 
regenerated transformed homologous (petunia) and 
heterologous (tobacco) plants. In this study the ocs 
gene was fused to the promoter of a petunia 
chlorophyll a/b binding protein. Cross-expression 
was obtained by W. Gurley et al. (1986) Mol. Cell. 
Biol. 6:559-565; and Key et al., EPO Application 
No. 85302593.0, filed April 12, 1985, who reported 
strong transcription in sunflower tumor tissue of a
soybean heat shock gene under control of its own
promoter. In this case functional activity was
measured as the correct thermal induction response.

The first evidence for transcription initiated from a
monocotyledon promoter in a dicotyledon host plant
was published by Matzke et al. (1984) EMBO J.
3:1525-1531. These workers cloned the maize zein
Z4 gene and introduced it on a Ti-derived vector into
sunflower stemlets. The ensuing zein mRNA could
then be translated in a wheat germ system but not in
the transformed sunflower calli.

In a later study the wheat gene whAB1.6 encoding
the major chlorophyll a/b binding protein was cloned
into a T-DNA-containing vector and transferred to
both petunia and tobacco (G. Lamppa et al. (1985)
Nature 316:750-752). Expression was obtained in
both the monocotyledon and dicotyledon hosts and
was determined to be light-induced and tissue-specific.
In a more recent study, Rochester et al. ((1986)
EMBO J. 5:451-458) obtained expression of the
maize heat shock hsp70 gene in transgenic petunia.
The maize hsp70 mRNA was synthesized only in
response to thermal stress. So far, these three
studies constitute the total number of published
reports describing successful expression of monocot
genes in transgenic dicot plants. However,
there are also negative reports describing minimal or
no expression of maize alcohol dehydrogenase gene
in tobacco hosts (Llewellyn et al. (1985) in Molecular
Form and Function of the Plant Genome, L. van
Vloten-Doting, G.S. Groot and T. Hall (eds.), Plenum
Publishing Corp., pp. 593-608; J.G. Ellis et al. (1987)
EMBO J. 6:11-16), suggesting a possible inherent
species-specific difference between monocot and
dicot promoters.

The heat shock response is believed to provide
thermal protection or thermotolerance to otherwise
nonpermissive temperatures (M. Schlesinger et al.
(1982) in Heat Shock from Bacteria to Man, Cold
Spring Harbor Laboratory, Cold Spring Harbor, New
York, p. 329). A permissive heat shock temperature
is a temperature which is high enough to induce the
heat shock response but not high enough to be
lethal. Thermotolerance in plant seedlings can be
attained by different treatment regimes: (a) a 1 to 2
hour exposure to continuous heat shock at 40° C
followed by a 45° C incubation, (b) a 30 minute heat
shock at 40° C followed by 2 to 3 hours at 28° C prior
to the shift to 45° C, (c) a 10 minute heat shock at
45° C followed by about 2 hours at 28° C prior to the
shift to 45° C and (d) treatment of seedlings with 50
nM arsenite at 28° C for 3 hours or more prior to the
shift to 45° C. During the pretreatment prior to
incubation at the potentially lethal temperature, heat
shock proteins are synthesized and accumulated.
Also, heat shock mRNA and protein syntheses
occur at 45° C, if the plant seedling is preconditioned
as described above. When the temperature is shifted
back to physiological levels (e.g., 28° C), normal
transcription and translation are resumed and after 3
to 4 hours at normal temperature, there is no longer
detectable synthesis of heat shock proteins (J. Key
et al. (1981) Proc. Natl. Acad. Sci. USA
78:3526-3530; M. Schlesinger et al. (1982) Trends
Biochem. Sci. 7:222-225). The heat shock proteins
that were synthesized during the 40° C heat shock
treatment are very stable and are not immediately
degraded.

Although ubiquitin is regulated in response to
environmental stress, including heat shock, the
regulation of ubiquitin transcription differs from that
of classical heat shock protein transcripts. Both 
ubiquitin and heat shock protein mRNA levels are 
elevated in response to cellular stress. However, 
whereas classical heat shock proteins accumulate 
during heat shock and persist during the recovery 
phase, ubiquitin mRNAs accumulated during heat 
shock are rapidly degraded within hours after stress 
treatment. This unstable mRNA transcript suggests 
a specialized but transient role for ubiquitin during 
heat shock, and implicates a unique DNA sequence 
in the ubiquitin gene promoter region, specifying 
specialized regulatory control during cellular re- 
sponse to stress. 

SUMMARY OF THE INVENTION 

The primary object of this invention is to provide 
novel DNA segments and constructions comprising 
a regulatory promoter system which will enable 
those skilled in the art to selectively express
structural genes in plant tissue. The promoter
comprises the DNA sequences from the 5' nontran- 
scribed regions of plant ubiquitin genes that initiate 
and regulate the transcription of genes placed under 
its control. In its preferred embodiment, the pro- 
moter sequence is derived from the upstream region 
of the ubiquitin gene from maize. 

The isolation and characterization of a promoter 
which is active in plants to control and regulate the 
expression of a downstream gene is described in the 
present work. This DNA sequence is found as a 
naturally occurring region upstream of the ubiquitin 
structural gene isolated from a maize genomic 
library. The transcription start site or cap site as 
determined by S1 nuclease mapping is designated 
as base 1 and the sequences embodied within about 
899 bases 5' of the transcription start site plus about 
1093 bases 3' of the cap site but 5' of the translation 
start site constitute the ubiquitin promoter. Located 
within this approximately 2 kb promoter region are a
TATA box (-30), two overlapping heat shock consen- 
sus elements (-204 and -214), an 83 nucleotide 
leader sequence immediately adjacent to the tran- 
scription start site and an intron extending from 
base 84 to base 1093. 
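
The figure of approximately 2 kb follows directly from the coordinates given above. The short calculation below is a sketch that simply restates that arithmetic; the constant and function names are hypothetical, and the only inputs are the positions quoted in this paragraph (899 bases 5' of the cap site, a leader of 83 nucleotides, and an intron from base 84 to base 1093).

```python
# Coordinates quoted in the text, numbered from the cap site (base 1).
UPSTREAM_BASES = 899      # bases 5' of the transcription start site
LEADER = (1, 83)          # untranslated leader, first and last base
INTRON = (84, 1093)       # intron, first and last base


def span_length(first, last):
    """Length of an inclusive coordinate range."""
    return last - first + 1


def promoter_summary():
    """Lengths of the segments that make up the promoter region."""
    transcribed = span_length(LEADER[0], INTRON[1])  # bases 1 through 1093
    return {
        "leader_nt": span_length(*LEADER),
        "intron_nt": span_length(*INTRON),
        "transcribed_nt": transcribed,
        "promoter_total_nt": UPSTREAM_BASES + transcribed,
    }


if __name__ == "__main__":
    for name, value in promoter_summary().items():
        print(f"{name}: {value}")
```

Under these figures the leader and intron together account for the 1093 transcribed bases, and adding the 899 upstream bases gives 1992 nucleotides, which is the "approximately 2 kb" region referred to in the text.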

A further object of this invention is to provide a 
recombinant DNA molecule comprising a plant 
expressible promoter and a plant expressible struc- 
tural gene, wherein the structural gene is placed 
under the regulatory control of all transcription
initiating and activating elements of the promoter. In 
particular, the plant ubiquitin promoter can be 
combined with a variety of DNA sequences, typically 
structural genes, to provide DNA constructions for 
regulated transcription and translation of said DNA 
sequences and which will allow for regulated control 
of expression when stressed with elevated tempera- 
tures. 

Such recombinant DNA molecules are introduced
into plant tissue so that the promoter/structural
gene combination is expressed. It is contemplated
that the method of the present invention is generally
applicable to the expression of structural genes in
both monocotyledonous and dicotyledonous plants.

BRIEF DESCRIPTION OF THE FIGURES 

Figure 1 is an analysis of a maize ubiquitin
genomic clone. (A) Restriction map of ubiquitin
gene, 7.2b1. (B) Restriction map of two subcloned
PstI fragments of ubiquitin gene 1. (C)
Schematic representation of maize ubiquitin
gene 1 organization. The 5' untranslated exon is
indicated by the open box and the tandem
ubiquitin coding regions are indicated by the
numbered boxes.

Figure 2 documents the DNA sequence and
the deduced amino acid sequence of ubiquitin
gene 1. The start of transcription as determined
by S1 nuclease mapping is denoted as base 1.
Sequences representing the putative "TATA"
box (-30) and the overlapping heat shock
consensus sequences (-214 and -204) are
underlined. The intron extends from base 84 to
base 1093 and the polyubiquitin protein coding
sequence extends from base 1094 to 2693.

Figure 3 demonstrates that all seven of the
ubiquitin coding repeats encode an identical
amino acid sequence. The nucleotide sequence
of the seven repeats is shown aligned under the
derived amino acid sequence. An additional
77th amino acid, glutamine, is present in the 7th
repeat preceding the stop codon. A polyadenylation
signal, AATAAT, is present in the 3'
untranslated region, 113 bp from the stop
codon.

Figure 4 is a diagrammatic presentation of
the procedure used for the construction of the
maize ubiquitin promoter region-chloramphenicol
acetyl transferase (CAT) gene fusion.

Figure 5 presents an assay for the ubiquitin
promoter. CaMV-CAT, cauliflower mosaic virus
35S promoter-CAT gene fusion; UBQ-CAT,
maize ubiquitin promoter-CAT gene fusion.
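
The organization described for Figure 3 (seven tandem 76-residue repeats, with one additional glutamine after the seventh) can be expressed as a simple split of the polyubiquitin translation product. The sketch below is illustrative only: the monomer used is a hypothetical placeholder string of the correct length, not the maize ubiquitin sequence, and the function name is an assumption.

```python
UBIQUITIN_LENGTH = 76   # residues per ubiquitin monomer, as stated in the text
REPEAT_COUNT = 7        # tandem coding repeats described for Figure 3


def split_polyubiquitin(protein):
    """Split a polyubiquitin chain into monomer repeats plus any C-terminal extension."""
    body_length = UBIQUITIN_LENGTH * REPEAT_COUNT
    repeats = [
        protein[start:start + UBIQUITIN_LENGTH]
        for start in range(0, body_length, UBIQUITIN_LENGTH)
    ]
    extension = protein[body_length:]
    return repeats, extension


# Hypothetical placeholder monomer: 76 characters ending in glycine.
# It stands in for the real sequence, which is not reproduced here.
PLACEHOLDER_MONOMER = "M" + "X" * 73 + "GG"
DEMO_POLYPROTEIN = PLACEHOLDER_MONOMER * REPEAT_COUNT + "Q"

if __name__ == "__main__":
    repeats, extension = split_polyubiquitin(DEMO_POLYPROTEIN)
    print(len(repeats), "repeats; identical:", len(set(repeats)) == 1)
    print("C-terminal extension:", extension)
```

With a real translated sequence in place of the placeholder, the same check (one distinct repeat, a single-residue extension) would correspond to the observation summarized in Figure 3.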

DETAILED DESCRIPTION OF THE INVENTION 

The following definitions are provided in order to
remove ambiguities as to the intent or scope of their
usage in the specification and claims.

Expression refers to the transcription and/or 
translation of a structural gene. 

Promoter refers to the nucleotide sequences at
the 5' end of a structural gene which direct the
initiation of transcription. Promoter sequences are
necessary, but not always sufficient, to drive the
expression of a downstream gene. In general,
eukaryotic promoters include a characteristic DNA
sequence homologous to the consensus 5'-TATAAT-3'
(TATA) box about 10-30 bp 5' to the
transcription start (cap) site, which, by convention,
is numbered +1. Bases 3' to the cap site are given
positive numbers, whereas bases 5' to the cap site
receive negative numbers, reflecting their distance
from the cap site. Another promoter component, the
CAAT box, is often found about 30 to 70 bp 5' to the
TATA box and has homology to the canonical form
5'-CCAAT-3' (R. Breathnach and P. Chambon (1981)
Ann. Rev. Biochem. 50:349-383). In plants the CAAT
box is sometimes replaced by a sequence known as
the AGGA box, a region having adenine residues
symmetrically flanking the triplet G(or T)NG (J.
Messing et al. (1983), in Genetic Engineering of
Plants, T. Kosuge et al. (eds.), Plenum Press,
pp. 211-227). Other sequences conferring regulatory
influences on transcription can be found within the
promoter region and extending as far as 1000 bp or
more from the cap site.
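
The numbering convention in this definition can be made concrete with a pair of conversion helpers. This is a minimal sketch with hypothetical function names; it assumes the usual reading of the convention, in which the cap site is +1, the base immediately 5' of it is -1, and there is no position 0.

```python
def index_to_position(index, cap_index):
    """Convert a 0-based string index to a cap-relative position.

    The cap site itself is +1 and the base immediately 5' of it is -1;
    position 0 is assumed not to exist.
    """
    offset = index - cap_index
    return offset + 1 if offset >= 0 else offset


def position_to_index(position, cap_index):
    """Convert a cap-relative position back to a 0-based string index."""
    if position == 0:
        raise ValueError("position 0 is not used in this numbering convention")
    if position > 0:
        return cap_index + position - 1
    return cap_index + position


if __name__ == "__main__":
    cap = 100  # hypothetical index of the cap site in a sequence string
    for idx in (70, 99, 100, 183):
        print(idx, "->", index_to_position(idx, cap))
```

For example, with the cap site at index 100, a TATA box described as lying at -30 would begin at index 70, and the last base of an 83-nucleotide leader (+83) would sit at index 182.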

Regulatory Control refers to the modulation of 
gene expression induced by DNA sequence ele- 
ments located primarily, but not exclusively, up- 
stream of (5' to) the transcription start site. 
Regulation may result in an all-or-nothing response 
to environmental stimuli, or it may result in variations 
in the level of gene expression. In this invention, the 
heat shock regulatory elements function to enhance 
transiently the level of downstream gene expression 
in response to sudden temperature elevation. 

Placing a structural gene under the regulatory 
control of a promoter or a regulatory element means 
positioning the structural gene such that the 
expression of the gene is controlled by these 
sequences. In general, promoters are found posi- 
tioned 5' (upstream) to the genes that they control. 
Thus, in the construction of heterologous promoter/ 
structural gene combinations, the promoter is 
preferably positioned upstream to the gene and at a 
distance from the transcription start site that 
approximates the distance between the promoter 
and the gene it controls in its natural setting. As is 
known in the art, some variation in this distance can 
be tolerated without loss of the promoter function. 
Similarly, the preferred positioning of a regulatory 
element with respect to a heterologous gene placed 
under its control reflects its natural position relative
to the structural gene it naturally regulates. Again, as 
is known in the art, some variation in this distance 
can be accommodated. 

Promoter function during expression of a structural
gene under its regulatory control can be tested at
the transcriptional stage using DNA-RNA hybridization
assays ("Northern" blots) and at the translational
stage using specific functional assays for the
protein synthesized (for example, by enzymatic 
activity or by immunoassay of the protein). 

Structural gene is that portion of a gene comprising
a DNA segment encoding a protein, polypeptide or
a portion thereof, and excluding the 5' sequence
which drives the initiation of transcription. The
structural gene may be one which is normally found
in the cell or one which is not normally found in the
cellular location wherein it is introduced, in which
case it is termed a heterologous gene. A heterologous
gene may be derived in whole or in part from
any source known to the art, including a bacterial
genome or episome, eukaryotic, nuclear or plasmid
DNA, cDNA, viral DNA or chemically synthesized
DNA. A structural gene may contain one or more
modifications in either the coding or the untranslated
regions which could affect the biological
activity or the chemical structure of the expression
product, the rate of expression or the manner of
expression control. Such modifications include, but
are not limited to, mutations, insertions, deletions
and substitutions of one or more nucleotides. The
structural gene may constitute an uninterrupted
coding sequence or it may include one or more
introns, bound by the appropriate splice junctions.
The structural gene may be a composite of segments
derived from a plurality of sources, naturally
occurring or synthetic. The structural gene may also
encode a fusion protein. It is contemplated that the
introduction into plant tissue of recombinant DNA
molecules containing the promoter/structural gene/
polyadenylation signal complex will include constructions
wherein the structural gene and its
promoter are each derived from different plant
species.

Plant Ubiquitin Regulatory System refers to the
approximately 1 kb nucleotide sequence 5' to the
translation start site of the maize ubiquitin gene and
comprises sequences that direct initiation of transcription,
regulation of transcription, control of
expression level, induction of stress genes and
enhancement of expression in response to stress.
The regulatory system, comprising both promoter
and regulatory functions, is the DNA sequence
providing regulatory control or modulation of gene
expression. A structural gene placed under the
regulatory control of the plant ubiquitin regulatory
system means that a structural gene is positioned
such that the regulated expression of the gene is
controlled by the sequences comprising the ubiquitin
regulatory system.

Polyadenylation signal refers to any nucleic acid
sequence capable of effecting mRNA processing,
usually characterized by the addition of polyadenylic
acid tracts to the 3'-ends of the mRNA precursors.
The polyadenylation signal DNA segment may itself
be a composite of segments derived from several
sources, naturally occurring or synthetic, and may
be from a genomic DNA or an mRNA-derived cDNA.
Polyadenylation signals are commonly recognized
by the presence of homology to the canonical form
5'-AATAA-3', although variation of distance, partial
"readthrough," and multiple tandem canonical sequences
are not uncommon (J. Messing et al.
supra). It should be recognized that a canonical
"polyadenylation signal" may in fact cause transcriptional
termination and not polyadenylation per se (C.
Montell et al. (1983) Nature 305:600-605).

Plant tissue includes differentiated and undifferentiated
tissues of plants, including, but not
limited to, roots, shoots, leaves, pollen, seeds, tumor
tissue and various forms of cells in culture, such as
single cells, protoplasts, embryos and callus tissue.
The plant tissue may be in planta or in organ, tissue
or cell culture.

Homology, as used herein, refers to identity or
near identity of nucleotide and/or amino acid
sequences. As is understood in the art, nucleotide
mismatches can occur at the third or wobble base in
the codon without causing amino acid substitutions
in the final polypeptide sequence. Also, minor
nucleotide modifications (e.g., substitutions,
insertions or deletions) in certain regions of the gene
sequence can be tolerated and considered
insignificant whenever such modifications result in
changes in amino acid sequence that do not alter the
functionality of the final product. It has been shown
that chemically synthesized copies of whole, or parts
of, gene sequences can replace the corresponding
regions in the natural gene without loss of gene
function. Homologs of specific DNA sequences may
be identified by those skilled in the art using the test
of cross-hybridization of nucleic acids under
conditions of stringency as is well understood in the art
(as described in Hames and Higgins (eds.) (1985)
Nucleic Acid Hybridisation, IRL Press, Oxford, UK).
Extent of homology is often measured in terms of
percentage of identity between the sequences
compared. Thus, in this disclosure it will be
understood that minor sequence variation can exist
within homologous sequences.
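
The percentage of identity mentioned above can be sketched as a simple
position-by-position comparison. This is a minimal illustration that
assumes the two sequences have already been aligned to equal length
(the alignment step is not shown), and the example sequences are
hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Hypothetical 10-base sequences differing at a single position.
print(percent_identity("ATGCAGATCT", "ATGCAAATCT"))
```

The same function works for amino acid sequences written in one-letter
code, since it only compares characters.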

Derived from is used herein to mean taken, 
obtained, received, traced, replicated or descended 
from a source (chemical and/or biological). A 
derivative may be produced by chemical or biologi- 
cal manipulation (including but not limited to sub- 
stitution, addition, insertion, deletion, extraction, 
isolation, mutation and replication) of the original 
source. 

Chemically synthesized, as related to a sequence
of DNA, means that the component nucleotides
were assembled in vitro. Manual chemical synthesis
of DNA may be accomplished using well established
procedures (M. Caruthers (1983) in Methodology of
DNA and RNA Sequencing, Weissman (ed.), Praeger
Publishers (New York) Chapter 1), or automated
chemical synthesis can be performed using one of a
number of commercially available machines.

Heat shock elements refer to DNA sequences that 
regulate gene expression in response to the stress 
of sudden temperature elevations. The response is 
seen as an immediate albeit transitory enhancement 
in level of expression of a downstream gene. The 
original work on heat shock genes was done with 
Drosophila but many other species including plants 
(T. Barnett et al. (1980) Dev. Genet. 1:331-340) 
exhibited analogous responses to stress. The 
essential primary component of the heat shock 
elements was described in Drosophila to have the 
consensus sequence 5'-CTGGAAT-TTCTAGA-3' 
and to be located in the region between residues -66 
through -47 bp upstream to the transcriptional start 
site (H. Pelham and M. Bienz (1982) supra ). A 
chemically synthesized oligonucleotide copy of this 
consensus sequence can replace the natural se- 
quence in conferring heat shock inducibility. In other 
systems, multiple heat shock elements were identi- 
fied within the promoter region. For example, 
Rochester et al. (1986) supra recognized two heat 
shock elements in the maize hsp 70 gene. 

Leader sequence refers to a DNA sequence 
comprising about 100 nucleotides located between 
the transcription start site and the translation start 
site. Embodied within the leader sequence is a 
region that specifies the ribosome binding site. 

Introns or intervening sequences refer in this work
to those regions of DNA sequence that are
transcribed along with the coding sequences (exons)
but are then removed in the formation of the mature
mRNA. Introns may occur anywhere within a
transcribed sequence - between coding sequences of
the same or different genes, within the coding
sequence of a gene, interrupting and splitting its
amino acid sequences, and within the promoter
region (5' to the translation start site). Introns in the
primary transcript are excised and the coding
sequences are simultaneously and precisely ligated
to form the mature mRNA. The junctions of introns
and exons form the splice sites. The base sequence
of an intron begins with GU and ends with AG. The
same splicing signal is found in many higher
eukaryotes.
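
The excision described above can be sketched as follows. This is a
simplified model that assumes the intron coordinates are already known
(0-based, end-exclusive); it only checks the GU...AG boundary rule
stated in the definition and does not model the splicing machinery.
The function name and the example transcript are hypothetical.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove the given intron intervals and join the flanking exons."""
    rna = pre_mrna.upper().replace("T", "U")  # accept DNA or RNA letters
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = rna[start:end]
        # Enforce the boundary rule: an intron begins with GU and ends with AG.
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"interval {start}-{end} is not bounded by GU...AG")
        exons.append(rna[cursor:start])
        cursor = end
    exons.append(rna[cursor:])
    return "".join(exons)


# Hypothetical transcript: exon "AUGGCC", intron "GUAAGUCCUUAG", exon "GAAUAA".
primary = "AUGGCC" + "GUAAGUCCUUAG" + "GAAUAA"
print(splice(primary, [(6, 18)]))
```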

The present invention relates to the development
of a recombinant vector useful for the expression of
DNA coding segments in plant cells. The vector
herein described employs a maize ubiquitin
promoter to control expression of an inserted DNA
coding segment. The transcriptional regulatory
sequences may be combined with an
extrachromosomal replication system for a predetermined host.
Other DNA sequences having restriction sites for
gene insertion may be added to provide a vector for
the regulated transcription and translation of the
inserted genes in said host. The vector may also
include a prokaryotic replication system allowing
amplification in a prokaryotic host, markers for
selection and other DNA regions. This would allow
large quantities of the vector to be grown in well
characterized bacterial systems prior to
transforming a plant or mammalian host. The principles for
construction of a vector having proper orientation of
the promoter and coding sequences with respect to
each other are matters well-known to those skilled in
the art. In some situations it may be desirable to join
the promoter system to a desired structural gene
and to introduce the resultant construct DNA
directly into a host. Methods for such direct
transfers include, but are not limited to, protoplast
transformation, electroporation, direct injection of
DNA into nuclei and co-transformation by calcium
precipitation.

This invention comprises the first report of an
isolated and characterized plant ubiquitin promoter.
The maize ubiquitin promoter as described in the
present work includes the RNA polymerase
recognition and binding sites, the transcriptional initiation
sequence (cap site), regulatory sequences
responsible for inducible transcription and an
untranslatable intervening sequence (intron) between the
transcriptional start site and the translational
initiation site. Two overlapping heat shock consensus
promoter sequences are situated 5' (-214 and -204)
of the transcriptional start site. An exon of 83
nucleotides is located immediately adjacent to the
cap site and is followed by a large (approximately
1 kb) intron.

The ubiquitin promoter along with the ubiquitin
structural gene can be isolated on two
approximately 2 kb PstI fragments of the maize genome
(Figure 1). The entire fragment can be used to show
promoter function by monitoring expression of
mRNA or protein. Introduction of a heterologous
gene downstream of the ubiquitin translation
initiation codon will result in the expression of a fused
protein. Insertion of a heterologous gene (having its
own start and stop codons) between the ubiquitin
promoter and translation initiation codon will result
in the expression of the native polypeptide
corresponding to the inserted gene. The insertion of the
desired structural gene is most conveniently
accomplished with the use of blunt-ended linkers at the
ends of the gene.

Alternatively, the ubiquitin gene fragment may be 
restricted, particularly at a site immediately preced- 
ing the start of the structural gene or at a site 
preceding the transcription start site. For example, 
in the present invention the promoter fragment was 
derived from the ubiquitin gene as an approximately 
2 kb PstI fragment. To ensure that the promoter
fragment is devoid of the translational initiation 
codon, the fragment containing the 5' flanking region 
may be selectively digested with double stranded 
exonuclease under controlled conditions to remove 
a desired number of nucleotide pairs. It is desirable 
to remove the ubiquitin translation initiation codon 
so that translation of the inserted gene will com- 
mence at its own start site. The isolated (and 
shortened) promoter fragment may then be inserted 
into the vector using linkers or homopolymer tailing 
to introduce desired restriction sites compatible 
with the remaining regions of the vector. In general, 
the promoter fragment may be cleaved with specific 
restriction enzymes and the resultant shortened 
DNA fragments tested for promoter function and 
compared to that of the intact promoter. In addition, 
DNA codons may be added and/or existing sequen- 
ces may be modified to give derivative DNA 
fragments retaining promoter functions. 

The resulting DNA constructs may be useful as 
cloning vehicles for a structural gene of interest in a 
plant host. In this invention, the structural gene 
encoding CAT under control of either the maize 
ubiquitin promoter or the cauliflower mosaic virus 
promoter was expressed in both oat and tobacco 
cells. When the ubiquitin promoter was employed, a 
greater degree of expression was obtained with the 
monocot host than with the dicot host; however, a 
higher level of expression was obtained with dicot 
than with monocot host when the cauliflower mosaic 
virus promoter was utilized. The differential in 
expression levels reflects the inherent inequality of 
different promoters as well as basic cellular differen- 
ces in regulation of expression and processing 
between monocots and dicots. To date, it is not 
predictable, routine or obvious that a monocot 
promoter will operate in a dicot host cell. 

A wide variety of structural genes may be
introduced into the subject DNA cloning vectors for
the production of desired proteins, such as
enzymes, hormones and the like. In addition, DNA
constructs of this type can be used for the enhanced
production of DNA derived from a particular gene, as
well as for enhanced production of mRNA which can
be used to produce cDNA. Such vectors carrying
specific DNA sequences find wide application and
are quite versatile; for example, they can be used for
amplification in bacteria as well as for expression in
higher cells which allow for additional cellular
functions. An advantage of utilizing higher
eukaryotic recombinant systems to produce
commercially, medically and agriculturally desirable
proteins is that they ensure correct
post-translational modifications which may otherwise be
difficult to duplicate in prokaryotic and lower
eukaryotic hosts.

In this invention the maize ubiquitin promoter was
shown to function in oat and tobacco, as examples
of monocots and dicots, respectively, and it is
conceivable that this promoter can function in yet
other cells. Such systems include, by way of
example, and without limitation, other cells from
which ubiquitin genes have been isolated and found
to be highly conserved, for example, other monocots
in addition to maize, dicots other than tobacco,
lower eukaryotic organisms such as yeast and
mammalian cells. The screening of cellular systems
suitable for use with the maize ubiquitin promoter
can be accomplished according to the teaching
herein, without undue experimentation. The
construction of vectors suitable for the expression of a
DNA coding segment in individual systems has been
well documented. Shuttle vectors capable of
replication in more than one host have also been described,
for example, shuttle expression vectors for both
yeast and mammalian cells, for plants and animal
cells and for plants and bacterial cells. In addition, it
will be understood that ubiquitin genes from any
other system, that are similar to the maize ubiquitin
gene in functioning as a plant promoter, may be
employed as the source for the ubiquitin promoter
sequence.

The present invention also relates to the utilization
of the maize ubiquitin promoter as a heat shock
promoter. Two heat shock consensus sequences
are located upstream of the maize ubiquitin gene at
positions -214 and -204. In many eukaryotes,
naturally occurring and chemically-synthesized
sequences homologous to the heat shock consensus
sequence have been shown to regulate the
induction of gene expression. Although the ubiquitin
promoter contains sequences that are identified as
being those of heat shock elements, the promoter is
distinguished from classical heat shock promoters
(1) in having a nontranslated intron 3' to the
transcription start site and (2) in regulating ubiquitin
expression constitutively as well as inductively. The
functional relationship between heat shock
elements and the presence of a large intron within the
promoter region is unknown in the prior art. The
nucleotide distance between these characteristic
features and also the directionality and orientation of
one element with respect to the other are presumed
in the present work to be variable, as long as the
basic promoter function of the derivative regulatory
fragments remains active.

The presence of an intron in the promoter region
has been related to the relative stability of the
unprocessed mRNA transcript and, indirectly, to the
level of protein synthesized (Callis et al. (1987)
Genes and Development 1:1183-1200).
Constitutively expressed ubiquitin mRNA has been reported
to be maintained at stable levels in chicken embryo
fibroblasts, whereas ubiquitin mRNA formed in
response to stress has a half-life of approximately
1.5 to 2 h.

In yeast four distinct ubiquitin-coding loci have
been described. Constitutively expressed ubiquitin
is encoded by one or more of three of the ubiquitin
genes, two of which contain an approximately 400 bp
intron immediately within the coding region. The
fourth ubiquitin gene, devoid of a nontranslated
intron but comprising multiple heat shock elements,
functions primarily in inducing ubiquitin expression
in response to stress. It has been shown that the
latter ubiquitin gene does not act constitutively but
rather is turned on in response to a heat shock or
stress signal (E. Ozkaynak et al. (1987) EMBO J.
6:1429-1439).

In maize, ubiquitin is encoded by a small multigene
family. The nucleotide sequence of one of the
ubiquitin genes is presented in this invention. A large
(approximately 1 kb) intron between the
transcriptional and the translational start sites as well as
nucleotide sequences corresponding to consensus
heat shock sequences are found within the maize
ubiquitin promoter region. These two regions of
specialization most probably are involved in ubiquitin
synthesis and in regulating the ubiquitin level in
response to external influences. The functional
relationship between the intron and the heat shock
elements encompassed within the ubiquitin
promoter is unknown. It is reported in this invention that
the maize ubiquitin promoter regulates the synthesis
of mRNA both under normal and under heat shock
conditions and that changes in the regulation of
transcription account for the enhancement in
ubiquitin synthesis after heat shock.

The following examples are offered by way of 
illustration and not by way of limitation. 

EXAMPLES 

Example 1 : Isolation and Characterization of the 
Maize Ubiquitin Gene 

A. Growth of Plants 

Zea mays inbred line B73 was grown in moist
vermiculite for 4 to 5 days at 25°C in the dark. The
portion of the seedlings from the mesocotyl node to
the shoot tip was harvested, frozen in liquid nitrogen
and stored at -80°C.

B. RNA Isolation and Analysis

Total cellular RNA was extracted from frozen
tissue using the guanidine thiocyanate procedure.
Poly(A)+ RNA was isolated from total cellular RNA
by passage over a poly U-Sephadex (Bethesda
Research Laboratories, Gaithersburg, MD) column.
Total or poly(A)+ RNA was electrophoresed in
1.5% agarose gels containing 3% (wt/vol)
formaldehyde. RNA was transferred to Gene Screen™
(DuPont) by capillary blotting using 26 mM sodium
phosphate (pH 6.5).

Blots were prehybridized in 50% formamide,
5X SSC, 100 µg denatured salmon DNA, 40 mM
sodium phosphate (pH 6.8), 0.5% BSA and 1% SDS.
Blots were hybridized in 60% formamide, 6X SSC,
100 µg/ml denatured salmon DNA, 40 mM sodium
phosphate (pH 6.8) and 10% dextran sulfate.
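
When such a buffer is prepared from concentrated stocks, the volume of
each stock follows from C1 x V1 = C2 x V2. The sketch below works
through this for the prehybridization buffer. The stock concentrations
(100% formamide, 20X SSC, 1 M sodium phosphate) and the 10 ml final
volume are assumptions chosen for illustration and are not stated in
this disclosure.

```python
def stock_volume_ml(final_conc: float, stock_conc: float, final_volume_ml: float) -> float:
    """Volume of stock needed so that final_conc is reached in final_volume_ml."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be more concentrated than the final solution")
    return final_conc * final_volume_ml / stock_conc


FINAL_VOLUME_ML = 10.0  # assumed batch size

# (component, final concentration, assumed stock concentration), same units per row
components = [
    ("formamide (%)", 50.0, 100.0),
    ("SSC (X)", 5.0, 20.0),
    ("sodium phosphate (mM)", 40.0, 1000.0),
]

for name, final, stock in components:
    volume = stock_volume_ml(final, stock, FINAL_VOLUME_ML)
    print(f"{name}: {volume:.2f} ml of stock")
```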



C. cDNA Library Construction

Double stranded cDNA was synthesized from
poly(A)+ RNA by a modification of the method of
Murray et al. (1983) Plant Mol. Biol. 2:75-84.
Oligo(dC)-tailing of the double-stranded cDNA and
annealing of oligo(dC)-tailed cDNA with
oligo(dG)-tailed pBR322 were performed using
standard technology. The annealed DNA was
transformed into E. coli HB101 and plated directly onto
nitrocellulose filters (Millipore, HATF; 0.45 µm) on
L-agar plates containing tetracycline (15 µg/ml).

D. Identification of Ubiquitin cDNA

A number of cDNAs representing potentially
light-regulated mRNAs were obtained by screening
a cDNA library by differential hybridization. Several
of these cDNAs were selected and further screened
by RNA blot analysis to confirm light regulation. One
cDNA clone, p6T7.2b1, while not representing a
red-light regulated mRNA, was of interest because it
hybridized with three poly(A)+ RNAs of different
size and abundance. Nick translated p6T7.2b1
hybridized strongly with the 2100 nucleotide and
1600 nucleotide mRNAs, but only weakly with the
800 nucleotide transcript. However, hybridization of
Northern blots with a single stranded 32P-labeled
RNA generated by SP6 polymerase transcription of
linearized pCA210, a plasmid constructed by
subcloning the cDNA insert of p6T7.2b1 into pSP64,
readily detected all three transcripts.

Since RNA-RNA hybrids are known to be more
thermally stable than DNA-RNA hybrids, single
stranded RNA probes rather than nick translated
DNA probes were used in Northern blot
hybridizations. Again, the 1600 base transcript was found to
be about 3 fold less abundant than the 2100 base
transcript as determined from Northern blots,
regardless of whether the blot was hybridized with nick
translated DNA or single strand RNA probes. The
smallest transcript was about half as abundant as
the 2100 base mRNA in blots hybridized with RNA
probes.

Restriction fragments were subcloned into
M13mp18 and/or mp19 and sequenced by the
dideoxynucleotide chain termination method.
Analysis of the sequence of the clone revealed a single
long open reading frame of 818 bp terminating in a
TAA stop codon. The National Biomedical Research
Foundation library was searched using the DFASTP
program for protein sequences homologous with the
deduced amino acid sequence. Greater than 95%
homology was found between the deduced amino
acid sequence of the maize cDNA clone and the
sequences of bovine and human ubiquitin.

E. Genomic Library Construction and Screening

High molecular weight maize DNA was isolated
from frozen maize seedlings. DNA was partially
digested with Sau3A, size fractionated and cloned
into the BamHI sites of Charon 35 (Loenen et al.
(1983) Gene 26:171-179). A library of about 2 x 10^6 pfu
was screened for recombinant phage containing
sequences homologous to the ubiquitin cDNA clone
by in situ plaque hybridization using a ubiquitin
cDNA clone as a hybridization probe. Recombinant
phage were purified from broth lysates and phage
DNA was isolated using standard techniques.
Restriction endonuclease digestions were carried out
according to manufacturers' specifications.

F. Genomic Southern Blot Analysis

Isolated, high molecular weight maize DNA was
digested with EcoRI, HindIII and SacI, fractionated
on 0.7% agarose gels and the DNA fragments were
transferred to Gene Screen Plus™ (DuPont). Filters
were prehybridized for 6-8h at 65°C in 6X SSC
(1X SSC = 0.15M NaCl, 0.025M Na citrate), 5X
Denhardt's medium, 100 µg/ml denatured,
sonicated salmon DNA, 20 µg/ml polyadenylic acid, 10
mM disodium EDTA and 0.5% SDS. Filters were
hybridized at 65°C in fresh buffer with 32P-labeled
plasmid DNA (pCA210). Autoradiography was
carried out at -80°C using Kodak X-OMAT AR Film and
one DuPont Cronex Lightning Plus intensifying
screen. In each digest, 8 to 10 restriction fragments
hybridized with the nick translated pCA210 probe,
suggesting that ubiquitin is coded by a small
multigene family. Evidence that ubiquitin is encoded
by a small multigene family has also been reported
for Xenopus, barley and yeast.

Two or three fragments in each digest hybridized 
strongly with the probe, whereas the remainder of 
the fragments hybridized weakly. The differences in 
hybridization intensities may reflect different se- 
quence homology such that the cDNA probe 
hybridizes preferentially to the gene from which it 
was derived. 

Ubiquitin genes from yeast and Xenopus have 
been characterized and have six and at least twelve 
ubiquitin repeats, respectively. Maize genes corre- 
sponding to the three transcripts detected on 
Northern blots may have seven, five and one or two 
ubiquitin repeats in the 2.1, 1.6 and 0.8 kb mRNAs, 
respectively. The maize ubiquitin gene described in 
this invention codes for seven repeats. Thus, the 
difference in hybridization intensity observed on 
Southern blots may be a result of the restriction 
fragments containing a different number of ubiquitin 
repeats. 
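
The relation between repeat number and transcript size suggested above
can be checked with simple arithmetic, using the 76-residue length of
ubiquitin. The sketch below counts only the nucleotides occupied by
the 76-codon repeats themselves; untranslated regions, any extra
terminal codon and the poly(A) tail are deliberately ignored, so the
remainder printed for each transcript is only a rough indication.

```python
UBIQUITIN_RESIDUES = 76  # amino acids per ubiquitin repeat
NT_PER_CODON = 3


def repeat_coding_nt(n_repeats: int) -> int:
    """Nucleotides occupied by n tandem ubiquitin repeats."""
    if n_repeats < 0:
        raise ValueError("repeat count must be non-negative")
    return n_repeats * UBIQUITIN_RESIDUES * NT_PER_CODON


# Transcript sizes from the Northern blots paired with the proposed repeat numbers.
for transcript_nt, repeats in [(2100, 7), (1600, 5), (800, 2), (800, 1)]:
    coding = repeat_coding_nt(repeats)
    print(f"{transcript_nt} nt transcript, {repeats} repeat(s): "
          f"{coding} nt in repeats, {transcript_nt - coding} nt remaining")
```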

The ubiquitin cDNA clone did not contain EcoRI
and HindIII sites. However, the maize ubiquitin genes
may contain introns which are cut by the restriction 
endonucleases used in the genomic digests. This 
could result in ubiquitin exons being on different 
fragments and could account for the differential 
hybridization intensities observed in the Southern 
blots. 

G. Ubiquitin Sequence Analysis and Transcription
Start Site Analysis

Dideoxynucleotide chain termination sequencing
was performed using Klenow fragments of DNA
polymerase I (Boehringer Mannheim). A 1.85 kb PstI
fragment of the genomic clone 7.2b1 (see Figure 1b)
homologous to the cDNA clone p6T7.2b1 and the 2
kb PstI fragment immediately upstream, termed
AC3#9M13RF, were subcloned in both orientations
into M13mp19. Recombinant phage RF DNA was
prepared as for plasmid DNA. Unidirectional
progressive deletion clones for sequencing both
strands of these PstI fragments were prepared.
Exonuclease III and Exonuclease VII were obtained
from New England Biolabs and Bethesda Research
Laboratories, respectively. Computer analysis of
DNA sequences was performed using programs
made available by the University of Wisconsin
Genetics Computer Group.

The transcription start site of the ubiquitin gene
and the 3' junction of the intron and exon in the 5'
untranslated region of the gene were determined by
S1 nuclease mapping. Fragments suitable for S1
probes were prepared as follows. The ubiquitin DNA
was digested with either BglII or XhoI. These were
then labeled with 32P using γ-32P ATP (6000 Ci/mmol,
New England Nuclear, Boston, MA) and T4
polynucleotide kinase (New England Biolabs). Subsequent
digestion of the BglII and XhoI kinased fragments
with PstI and EcoRI, respectively, generated a 946
bp PstI-BglII fragment and a 643 bp EcoRI-XhoI
fragment. These fragments were separated from the
other end-labeled fragments by electrophoresis
through a 5% polyacrylamide gel. Slices containing
the 946 bp PstI-BglII and the 643 bp EcoRI-XhoI
fragments were cut out of the gel and the labeled
DNAs were eluted from the gel. End-labeled DNA
fragment (10-20 fmole) was hybridized with 2 µg of
poly(A)+ RNA in 30 µl of buffer containing 80%
deionized formamide, 0.4M sodium chloride, 40mM
PIPES (pH 6.4) and 1mM EDTA (pH 8.0). The nucleic
acid solution was heated to 80°C for about 16h.
Ice-cold S1 digestion buffer (300 µl) containing
280mM sodium chloride, 50mM sodium acetate
(pH 4.6), 4.5mM zinc sulfate and 20 µg/ml single
stranded DNA was added and the DNA digested
with 250 units/ml of S1 nuclease (New England
Nuclear). The reaction was stopped with 75 µl of S1
termination mix containing 2.5M ammonium acetate
and 50mM EDTA. The products of the S1 nuclease
digestion were then separated on a 6%
polyacrylamide/8M urea gel and visualized by autoradiography.
The end points of the S1 protected fragments in the
ubiquitin sequence were determined by comparison
with a sequence ladder generated by Maxam/Gilbert
base modification-cleavage reactions carried out on
the end labeled fragments used as S1 probes.

The DNA sequence of the maize ubiquitin-1 gene,
7.2b1, is shown in Figure 2. The sequence is
composed of 899 bases upstream of the
transcription start site, 1992 bases of 5' untranslated and
intron sequences, and 1999 bases encoding seven
ubiquitin protein repeats preceding 249 bases of 3'
sequence. A "TATA" box is located at -30 and two
overlapping heat shock elements are located at -214
and -204. The DNA sequence of the coding and 3'
regions of the ubiquitin-1 gene from maize, 7.2b1, is
also presented in Figure 3. The derived amino acid
sequence of maize ubiquitin is shown at the top and
the nucleotide sequence of the seven ubiquitin
repeats is aligned underneath. A schematic of the
organization of the seven complete ubiquitin units in
the genomic DNA is shown in Figure 1C.

The derived amino acid sequences of all of the
ubiquitin repeats are identical (Figure 3). The
terminal (seventh) ubiquitin repeat contains an
additional 77th amino acid, glutamine, prior to the
TAA stop codon. This additional amino acid is not
found in mature ubiquitin, and is apparently removed
during processing. The 77th amino acid of the final
repeat in the human gene is valine, while in the two
chicken genes, it is tyrosine and asparagine. Yeast
and barley also have an extra amino acid, asparagine
and lysine, respectively; however, an extra amino
acid was not found in the Xenopus gene. This extra
amino acid has been proposed to function as a block
to conjugation of unprocessed polyubiquitin to
target proteins. A polyadenylation signal (AATAAT)
is present in the 3' untranslated sequence, 113 bp
from the stop codon.

All seven repeats encode the identical amino acid 
sequence, whereas the nucleotide sequence of the 
repeats varies by as many as 39 nucleotides. This is 
similar to what has been reported for the nucleotide 
sequence homologies between ubiquitin genes. 
About 80% of the nucleotide mismatches between 
ubiquitin repeats are at the third (wobble) base in the 
codon. Alternate codon usage for leucine (5 co- 
dons), serine (3 codons) and arginine (3 codons) 
account for the remaining nucleotide mismatches. 
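
The kind of tally behind such a statement can be sketched as follows:
two in-frame repeats of equal length are compared base by base and
each mismatch is assigned to its position within the codon. The two
short sequences are hypothetical and are not taken from Figure 3; both
are assumed to start at the first base of a codon.

```python
def mismatches_by_codon_position(repeat_a: str, repeat_b: str) -> dict[int, int]:
    """Count mismatches at codon positions 1, 2 and 3 of two in-frame sequences."""
    if len(repeat_a) != len(repeat_b):
        raise ValueError("repeats must have the same length")
    counts = {1: 0, 2: 0, 3: 0}
    for index, (a, b) in enumerate(zip(repeat_a.upper(), repeat_b.upper())):
        if a != b:
            counts[index % 3 + 1] += 1  # position within the codon (1-based)
    return counts


# Hypothetical fragments: Leu-Thr-Gly-Lys encoded with different codons.
fragment_a = "CTGACCGGCAAG"
fragment_b = "TTGACTGGGAAG"

counts = mismatches_by_codon_position(fragment_a, fragment_b)
total = sum(counts.values())
print(counts, f"third-base fraction: {counts[3] / total:.2f}")
```

In this toy example the first-position mismatch comes from alternate
leucine codons (CTG and TTG), the same kind of change the text
attributes to leucine, serine and arginine codon usage.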

The amino acid sequence for maize ubiquitin is 
identical to that determined for two other higher 
plants, oat and barley. The sequence differs from the 
sequence reported for yeast by two amino acids: 
alanine for serine substitutions at positions 28 and 
57. The maize sequence is also slightly different from
that reported for ubiquitin from all animals; substitu- 
tions by serine for proline at position 19, aspartate 
for glutamate at position 24 and alanine for serine at 
position 57. Thus, based on sequence, there appear 
to be three types of ubiquitin: plant, animal and 
yeast. 

Example 2: Construction of plasmid pUB-CAT 
comprising the maize ubiquitin promoter and a 
structural gene 

A. Promoter isolation and construction of pUB-CAT

The procedure used for construction of the
ubiquitin gene upstream region-chloramphenicol
acetyl transferase (CAT) gene fusion is outlined in
Figure 4. The BamHI-HindIII restriction fragment
containing the CAT gene and the nopaline synthase
(NOS) 3' untranslated region and polyadenylation
signal of pNOS-CAT (Fromm et al. (1985) Proc. Natl.
Acad. Sci. 82:5824-5828) was subcloned into BamHI
and HindIII digested pUC18. This construct was
termed pUC18-CAT.

An approximately 2.0 kb PstI fragment immediately
upstream of the ubiquitin polyprotein coding region
of the maize ubiquitin gene 7.2b1 was subcloned into
M13mp19. This segment of DNA spans nucleotides
-899 to 1092 of the maize ubiquitin sequence
documented in Figure 2. This recombinant DNA was
termed AC3#9M13RF and contains the ubiquitin
promoter, 5' untranslated leader sequence and
about 1 kb intron, labeled UBI-5' in Figure 4.

The ubiquitin promoter-CAT reporter gene fusion
was constructed by blunt ending with T4 DNA
polymerase the 2.0 kb PstI fragment of
AC3#9M13RF and cloning this fragment into
SmaI-digested pUC18-CAT. This construct was termed
pUB-CAT.

B. Introduction of Recombinant DNA into Oat and
Tobacco Protoplasts

Leaves (2g) of 5- to 6-day old etiolated oat
seedlings were finely chopped with a razor blade.
The tissue was rinsed several times with digestion
medium (3mM MES, pH 5.7, 10mM calcium chloride,
0.5M mannitol and 2 mg/ml arginine) and then
incubated for 4h at room temperature with 20 ml
digestion medium containing 2% cellulase. The
tissue was shaken occasionally to release
protoplasts. The material was filtered through a 63 µm
mesh and centrifuged 5 min at 50 x g. The
supernatant fluid was removed and the protoplasts were
washed two times with digestion medium and then
resuspended in electroporation buffer to give 0.5 ml
of protoplast suspension per electroporation. The
electroporation buffer consisted of: 10mM HEPES,
pH 7.2, 150mM sodium chloride, 4mM calcium
chloride and 0.5M mannitol.

Protoplasts (0.5ml) in electroporation buffer were
mixed on ice with 0.5ml of electroporation buffer
containing 40 µg plasmid DNA plus 100 µg sonicated
salmon DNA. The protoplasts were electroporated
on ice with a 350 volt, 70msec pulse. The protoplasts
were incubated another 10 min on ice, then diluted
into 10ml Murashige-Skoog (MS) medium and
incubated at room temperature for 24h.

Protoplasts were pelleted by centrifugation for 5
min at 50 x g. The supernatant fluid was removed and
the protoplasts washed once with MS medium. The
protoplast pellet was resuspended in 200 µl Buffer A
(0.25M Tris, pH 7.8, 1mM EDTA, 1mM
β-mercaptoethanol) and transferred to a microcentrifuge tube.
Protoplasts were disrupted by sonication for 5-10
sec at the lowest setting. Protoplast debris was
pelleted by centrifugation for 5 min at 4°C. The
supernatant fluid was removed, heated to 65°C for
10 min and stored at -20°C.

C. Assay for CAT activity in transformed protoplasts

Aliquots (100 µl) of the electroporated protoplast
extract (extract of cells transformed with
recombinant DNA) were added to 80 µl of Buffer A and 20 µl of
a mix of 20 µl 14C-chloramphenicol (40-60 mCi/mM),
2mg acetyl CoA and 230 µl Buffer A. The reaction was
incubated for 90 min at 37°C. The reaction products
were extracted with 600 µl ethyl acetate and were
concentrated by evaporating the ethyl acetate and
resuspending in 10 µl ethyl acetate. The reaction
products were separated by thin layer
chromatography using chloroform:methanol (95:5, v/v) solvent
and were detected by autoradiography.

Transformation of host cells was determined by
measuring the amount of enzymatic activity expressed
by the structural gene contained within the
promoter-gene fusion construct. In this example, the
structural gene encoding chloramphenicol acetyl
transferase was employed in the DNA construct. To
test the efficacy of the promoter utilized in the
recombinant DNA fusion construct, parallel electroporations
were carried out, utilizing either the maize
ubiquitin promoter-CAT gene fusion pUOCAT (described
herein and in Figure 4) or pCaMV-CAT, a






cauliflower mosaic virus 35S promoter-CAT gene
fusion (Fromm et al. (1985) Proc. Natl. Acad. Sci.
USA 82:5824-5828) obtained from V. Walbot, Stanford
University. As illustrated in Figure 5, in oat
protoplasts the ubiquitin promoter is "stronger" than
the CaMV promoter, as judged by the amount of
enzymatic activity expressed.

Example 3: Heat Shock Response 

A. Heat Shock Treatment 

For heat shock treatment, 4- to 5-day old etiolated seedlings
were transferred to an incubator at 42°C and
harvested 1, 3 and 8h after transfer. Total RNA (7µg)
was isolated, denatured and electrophoresed
through a 1.5% agarose, 3% formaldehyde gel. The
RNA was transferred to Gene Screen and probed
with single stranded RNA transcribed from linearized
pCA210 using SP6 RNA polymerase. (The
recombinant plasmid, pCA210, was constructed by
subcloning the 975 bp insert of p6R7.2b1 into pSP64
(Promega) so that SP6 RNA polymerase synthesized
an RNA probe specific for hybridization
with ubiquitin mRNA.) After autoradiography, the
bands were cut out and the amount of radioactivity
bound to the filter was determined by liquid
scintillation. From analysis of the Northern blots,
levels of three ubiquitin transcripts were determined.

One hour after transfer to 42°C, the level of the
2.1kb transcript increased 2.5 to 3 fold. An approximately
2 fold increase was observed for the 1.6kb
transcript; however, no increase was seen for the
0.8kb transcript. By three hours after transfer of the
seedlings to elevated temperature, the levels of the
two largest ubiquitin transcripts had returned to the
level observed in unshocked tissue and remained at
those levels for at least another five hours. The
transitory nature of the ubiquitin induction during the heat shock
response in maize may indicate that ubiquitin has a
specialized role in heat shock and that only brief
periods of increased levels of ubiquitin are required.
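
The fold increases reported above are ratios of the radioactivity bound to the excised bands from heat-shocked and unshocked tissue. A minimal sketch of that arithmetic follows. The function name `fold_induction`, the optional background subtraction and all counts shown are hypothetical illustrations, not measurements from this example.

```python
def fold_induction(control_cpm, treated_cpm, background_cpm=0.0):
    """Ratio of background-corrected counts: heat-shocked over unshocked.

    All arguments are scintillation counts per minute for one transcript band.
    Background subtraction is an ASSUMPTION; the text does not describe it.
    """
    control = control_cpm - background_cpm
    treated = treated_cpm - background_cpm
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return treated / control


if __name__ == "__main__":
    # HYPOTHETICAL counts for one band, chosen only to illustrate the ratio.
    unshocked = 400.0
    shocked_1h = 1100.0
    print(f"fold induction after 1h: {fold_induction(unshocked, shocked_1h):.2f}")
```

A ratio near 1 corresponds to a transcript whose level has returned to that of unshocked tissue.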

B. Heat Shock Sequences 

The nucleotide sequence of the maize ubiquitin 
gene is presented in Figure 2. Within the promoter 
region are nucleotide sequences homologous to the 
consensus heat shock sequence that has been 
shown to confer stress inducibility when placed 
upstream of heterologous promoters (Pelham 
(1982) supra ). The consensus sequence for the 
Drosophila heat shock element is 
5'-CTG G A AT TTCTAGA-3' 

and is generally found approximately 26 bases 
upstream of the transcriptional start site. 

Located within 900 bases 5' to the transcriptional 
start site of the maize ubiquitin promoter are two 
overlapping heat shock sequences: 
5'-CTGGA CCCCTCTCGA-3' starting at nucleotide
-214, and

5'-CTCGA GAGTTCCGCT-3' starting at nucleotide
-204. The ubiquitin promoter from chicken embryo
fibroblasts was also found to contain two overlap- 
ping heat shock consensus promoter sequences: 
5'-CTCGA ATCTTCCAG-3' starting at nucleotide 
-369, and 



5'-CCAGA G CTTTCTTTT-3' starting at nucleotide
-359. The 5' flanking region of the yeast ubiquitin
gene UBI4 (E. Ozkaynak et al. (1987) supra)
comprises an 18 bp, rotationally symmetric (palindromic)
sequence, 5'-TTCTAGAACGTTCTAGAA-3',
365 bases upstream of the translation start site. The
middle 14 bases (underlined) of this 18 bp sequence
contain an exact homology to the rotationally
symmetric consensus 'heat shock box' nucleotide
sequence starting at approximately 284 nucleotides
upstream of the presumed transcription start site.
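
The degree of homology between a candidate heat shock element and the consensus can be estimated by a position-by-position comparison. The sketch below is one simple interpretation and is offered under stated assumptions: the text does not specify how homology is scored, the function name `percent_identity` is hypothetical, and the consensus is written with an "N" at the one position that is not legible in the printed sequence, which is treated as matching any base and excluded from the count.

```python
# Consensus as printed in the text. ASSUMPTION: the single gap in the printed
# sequence is represented by "N" (any base) and is not counted.
CONSENSUS = "CTGGAATNTTCTAGA"

# The two maize elements quoted in the text, with spaces removed.
MAIZE_ELEMENTS = ["CTGGACCCCTCTCGA", "CTCGAGAGTTCCGCT"]


def percent_identity(candidate, consensus=CONSENSUS):
    """Percent of non-wildcard consensus positions matched by candidate."""
    if len(candidate) != len(consensus):
        raise ValueError("candidate and consensus must have equal length")
    compared = 0
    matches = 0
    for cand_base, cons_base in zip(candidate.upper(), consensus.upper()):
        if cons_base == "N":
            continue  # wildcard position: skipped, neither match nor mismatch
        compared += 1
        if cand_base == cons_base:
            matches += 1
    return 100.0 * matches / compared


if __name__ == "__main__":
    for element in MAIZE_ELEMENTS:
        print(f"{element}: {percent_identity(element):.1f}% identity")
```

Other scoring schemes, for example one that compares only the invariant positions of the consensus, would give different percentages.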

The relative position of the heat shock sequence
with respect to the transcriptional initiation site
and its ultimate consequence on the magnitude of
the induction response to heat shock or other stress
remain largely unknown, although it has been
suggested (U. Bond et al. (1986) supra) that the
further a heat shock element is located 5' from the
transcriptional start site, the smaller is the level of
induction in response to stress.

In this invention it is assumed that a heat shock
sequence may be arbitrarily positioned at different
loci within the ubiquitin promoter and that it may be
chemically altered in sequence or be replaced with a
synthetic homologous sequence, so long as the
modified promoter sequence retains ubiquitin promoter
function, which comprises the initiation,
direction and regulation of transcription under stress
and non-stress conditions. Biochemical techniques
required to insert and/or delete nucleotides and to
manipulate the resultant DNA fragments are well
known to those skilled in the art of genetic
redesigning.

Example 4: Presence of Heat Shock Sequence(s)
and a Large Intron Within the Ubiquitin Promoter

The ubiquitin promoter from maize is characterized
structurally by the presence of two overlapping
heat shock sequences approximately two
hundred bp upstream of the transcriptional start site
and by a large (approximately 1kb) intron
between the transcriptional start site and the
translational initiation codon. This promoter structure
is very similar to that reported (U. Bond et al.
(1986) supra) for the ubiquitin promoter from
chicken embryo fibroblasts in which two overlapping
heat shock sequences are located approximately
350 bp upstream of the transcriptional start site and
a 674 bp intron is contained between the transcriptional
and translational initiation sites. Recently
(E. Ozkaynak et al. (1987) supra), the nucleotide
sequence of the promoter region from the yeast ubiquitin
UBI4 gene was determined and found to
contain a heat shock sequence approximately 280
bp upstream of the transcriptional start site, but this
yeast ubiquitin promoter was devoid of a large intron
between the transcription and translation initiation
sites. However, two other yeast ubiquitin genes,
which did contain introns, were found to be lacking
sequences homologous to the Pelham "heat shock
box" sequence.

Ubiquitin promoters have been shown to up-regulate
expression of ubiquitin in response to heat
shock in yeast, chicken embryo fibroblasts and
maize. In all three systems, the level of ubiquitin
mRNA is elevated after heat shock treatment and
the increase in ubiquitin level was determined in
maize and chicken embryo fibroblasts to be approximately
3 fold. This enhancement in ubiquitin expression
in response to heat shock is significantly
less than that obtained with other heat shock genes.
It was found in chicken embryo fibroblasts that the
levels of ubiquitin mRNA in cells exposed to 45°C
increased 2.5 fold over a 2.5h period, whereas the
levels of HSP70 mRNA increased 10 fold under the
same heat shock conditions. Moreover, the relative
instability of ubiquitin mRNA during recovery of cells
from a 3h heat shock (half-life of approximately 1.5 to
2h) was also found to differ significantly from that of
HSP70 mRNAs which were found to be stable.
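
The half-life quoted above can be related to the fraction of ubiquitin mRNA expected to remain at a given time during recovery. The sketch below assumes simple first-order (exponential) decay, which the text does not state, and the function name `fraction_remaining` is hypothetical.

```python
def fraction_remaining(hours, half_life_hours):
    """Fraction of the initial mRNA left after `hours`.

    ASSUMPTION: first-order decay, i.e. the amount halves every half-life.
    """
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (hours / half_life_hours)


if __name__ == "__main__":
    # The two half-life bounds given in the text (1.5h and 2h).
    for half_life in (1.5, 2.0):
        left = fraction_remaining(3.0, half_life)
        print(f"half-life {half_life}h: fraction left after 3h = {left:.3f}")
```

Under this assumption a stable mRNA corresponds to a half-life that is long relative to the recovery period, so that the fraction remaining stays close to 1.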

It is interesting to note that in contrast to ubiquitin
promoters, HSP70 genes do not contain large
introns between the transcriptional and translational
initiation sites. Another difference between the
ubiquitin promoter and other heat shock promoters
is that ubiquitin is expressed both constitutively and
inductively, whereas expression of classical heat
shock proteins occurs predominantly in response to
heat shock or other stress. This invention allows
skilled workers knowledgeable in the art to modify
the ubiquitin promoter with respect to the composition,
sequence and position of both the intron and the
heat shock sequences in order to alter constitutive
and/or inductive expression of ubiquitin. Also,
standard recombinant technology may be employed
to reposition, as well as to chemically alter, the
nucleotide sequences within the maize ubiquitin
promoter region in such a fashion as to retain or
improve the promoter function of the resultant
modified DNA. Testing for ubiquitin promoter function
may be carried out as taught in Example 2.



Claims

1. A DNA sequence no larger than 2 kb, said
DNA sequence comprising a plant ubiquitin
regulatory system, wherein said regulatory
system contains a heat shock element and an
intron.

2. A DNA sequence as in claim 1 wherein said
heat shock element is at least 75% homologous
to the heat shock consensus sequence.

3. A DNA sequence as in claim 1 wherein said
heat shock element is not more than approximately
214 nucleotides 5' of the transcription
start site.

4. A DNA sequence as in claim 3 wherein said
regulatory system contains two heat shock
elements.

5. A DNA sequence as in claim 4 wherein said
heat shock elements are situated so as to
overlap.

6. A DNA sequence as in claim 1 wherein said
intron is approximately 1 kb in length.

7. A DNA sequence as in claim 6 wherein said
intron is that of the maize ubiquitin promoter.

8. A DNA sequence as in claim 6 wherein said
intron is located not more than 83 nucleotides 3'
of the transcription start site.

9. A DNA sequence as in claim 1 wherein said
heat shock element is located upstream of the
transcription start site and wherein said intron
is located downstream of the transcription start
site.

10. A DNA construct comprising:

(a) a DNA sequence no larger than 2 kb,
said DNA sequence comprising a plant
ubiquitin regulatory system, wherein said
regulatory system contains a heat shock
element and an intron, and

(b) a plant-expressible structural gene
wherein said structural gene is placed
under the regulatory control of said plant
ubiquitin regulatory system.

11. A DNA construct as in claim 10 wherein
said structural gene codes for a gene that is not
under heat shock control in nature.

12. A method for the constitutive expression of
a structural gene and the selected stress-induced
enhancement in expression of said
structural gene in a plant cell comprising the
steps of:

(a) transforming said plant cell with a
DNA construct comprising a plant ubiquitin
regulatory system, wherein is found a heat
shock element and an intron, and a
plant-expressible structural gene that is
under the regulatory control of said plant
ubiquitin regulatory system, and

(b) selectively applying stress conditions
to said transformed plant cell thereby
inducing enhancement in expression of
said structural gene.